COURSES

Introduction

Welcome to LADAL Courses. This page provides structured learning pathways for students and researchers who want to apply computational and quantitative methods to the study of language and the humanities. Whether you are a complete beginner picking up R for the first time or an experienced analyst looking to extend your skills into natural language processing or advanced statistics, there is a pathway here for you.

What Are LADAL Courses?

LADAL Courses are curated sequences of tutorials, readings, and practical exercises designed to help learners progress systematically from foundational knowledge to more advanced skills, using free, open, and reproducible tools. All courses are built around resources from the Language Technology and Data Analysis Laboratory (LADAL) and use R as the primary analysis environment.

Courses are available in two formats:

  • Short Courses — self-paced, independent-study sequences of 6–10 tutorials covering one focused topic. Ideal for researchers who want to build a specific skill quickly, or for instructors looking for a compact module to embed in a larger course.
  • Long Courses — structured 12-week, semester-length courses with weekly lecture topics, LADAL tutorials, and recommended readings. Designed to serve as a complete scaffold for university courses, or for motivated independent learners who want a thorough grounding in a field.

By following a LADAL course, learners will be able to:

  • Understand the key concepts and principles behind quantitative research and computational text analysis
  • Develop practical skills in R, including data management, visualisation, statistics, and text analytics
  • Apply statistical and computational methods to real-world research questions in linguistics, the humanities, and the social sciences
  • Build reproducible and transparent research workflows, ensuring robust and reliable analyses

Who Is This For?

The table below gives a quick overview of each course’s intended audience and assumed background. Courses are listed in approximate order from most introductory to most advanced.

Course overview by audience and background

Course | Format | Audience | Assumed background
Introduction to Language Technology | Short | Linguists and humanities students curious about language technology | None
Intro to Corpus Linguistics | Short | Linguistics students; language teachers and researchers | None
Intro to Text Analysis | Short | Humanities and social science students | None
Data Visualisation for Linguists | Short | Linguists and language researchers | Basic R helpful
Introduction to Statistics | Short | Humanities and social science students | None
Introduction to Learner Corpus Research | Short | Applied linguists; SLA researchers; language teachers | Basic corpus linguistics helpful
Introduction to Digital Humanities with R | Long | Humanities researchers and students | None
Intro to Corpus Linguistics and Text Analysis | Long | Linguistics and applied linguistics students | None
Introduction to Statistics | Long | Humanities and social sciences students and researchers | None
Natural Language Processing with R | Short | Computational linguists; data scientists working with language | Intermediate R; basic statistics
Advanced Statistics | Long | Researchers with prior statistics knowledge | Basic statistics and R

Short Courses

Short courses are designed for independent learners and can be worked through at your own pace. They cover some of LADAL’s most popular topics and consist of 6–10 tutorials, organised in a logical sequence from foundational to applied.


Introduction to Language Technology

Duration: 6 tutorials · Self-paced
Audience: Anyone curious about how computers process and analyse language — no prior programming or linguistics background required
Aim: Provide a conceptual and practical first introduction to language technology: what it is, what it can do, and how to get started using freely available tools in R

Language technology encompasses the computational tools and methods used to analyse, generate, and interact with human language. This short course introduces learners to the landscape of language technology — from basic text processing and regular expressions to corpus tools, OCR, and an overview of modern NLP — with hands-on practice in R. By the end, learners will understand the key methods and be equipped to explore more specialised pathways.

What you will gain:

  • A conceptual map of language technology and its applications in linguistics and the humanities
  • Practical experience with text data in R: loading, cleaning, and exploring text
  • Familiarity with regular expressions as a foundation for all text-analytic work
  • Hands-on experience with OCR for converting PDFs and scanned documents to text
  • An understanding of how corpus tools and NLP pipelines are constructed

1. Introduction to Text Analysis
An overview of the field: what text analysis is, how it relates to corpus linguistics and NLP, and what kinds of research questions it can address. Introduces key concepts including corpus, token, type, frequency, and concordance.

2. Getting Started with R
Your first introduction to R and RStudio, covering the essential operations needed for text analysis: installing packages, loading data, working with vectors and data frames, and writing simple functions.

3. Loading and Saving Data
How to import text data into R from a variety of formats — plain text files, CSV, Excel, and web URLs — and how to save results for later use.

4. String Processing
An introduction to working with character strings in R using stringr. Covers pattern matching, substitution, splitting, and the basic string operations that underpin all subsequent text analysis.

5. Regular Expressions
A practical introduction to regular expressions (regex) — the pattern language used to search, extract, and transform text. Covers character classes, quantifiers, anchors, and look-arounds with worked linguistic examples.

6. Converting PDFs to Text
How to extract machine-readable text from PDF documents using pdftools (for digitally generated PDFs) and tesseract (for scanned documents), including post-OCR spell-checking workflows.
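The string-processing and regex steps in tutorials 4 and 5 can be sketched in a few lines of base R. The tutorials themselves use stringr, whose functions (str_detect(), str_replace_all(), str_extract()) mirror the base equivalents shown here; the example texts are invented for illustration:

```r
# Toy texts (invented for illustration)
texts <- c("The cat sat.", "A dog ran!", "Cats and dogs.")

# Pattern matching: which strings contain 'cat' or 'Cat'?
has_cat <- grepl("[Cc]at", texts)

# Substitution: strip punctuation
clean <- gsub("[[:punct:]]", "", texts)

# Splitting lower-cased text into word tokens
tokens <- unlist(strsplit(tolower(clean), "\\s+"))

# Extraction: pull every token ending in 's'
plurals <- regmatches(tokens, regexpr("\\w+s$", tokens))
```

These four operations (detect, replace, split, extract) underpin virtually every text-preparation step in the later tutorials.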


Introduction to Corpus Linguistics

Duration: 7 tutorials · Self-paced
Audience: Linguistics students; language teachers; researchers new to corpus methods
Aim: Introduce the core methods of corpus linguistics — concordancing, collocations, keyness analysis, and frequency-based exploration — using R and reproducible workflows

Corpus linguistics uses large, principled collections of authentic text (corpora) to investigate patterns of language use. This short course takes learners from a conceptual introduction to the field through hands-on practice with the most widely used corpus methods, culminating in a case-study showcase that demonstrates how the individual techniques combine into a full corpus-based analysis.

What you will gain:

  • A clear understanding of what a corpus is and how corpus-based research differs from introspective approaches
  • Practical skills in frequency analysis, concordancing, collocation identification, and keyword extraction using R
  • The ability to design, conduct, and report a reproducible corpus-based study
  • Familiarity with key R packages for corpus linguistics: quanteda, tidytext, and related tools

1. Introduction to Text Analysis
Introduces corpus linguistics and text analysis as fields, defining key concepts — corpus, concordance, collocation, keyword, frequency — that are used throughout the course.

2. Getting Started with R
First introduction to R and RStudio. Focus on the first four sections (up to Working with Tables) for the purposes of this course.

3. String Processing
Essential string manipulation skills for handling raw text data in R: pattern matching, substitution, tokenisation preparation, and whitespace management.

4. Concordancing (Keywords-in-Context)
How to search a corpus for words and phrases and display the results as KWIC (keyword-in-context) concordances using R. Covers sorting, filtering, and interpreting concordance output.

5. Collocation and N-gram Analysis
How to identify statistically significant word collocations and extract n-gram sequences from a corpus, including association measures (PMI, log-likelihood, t-score) and visualisation.

6. Keyness and Keyword Analysis
How to compare two corpora and identify the words that are statistically more or less frequent in one relative to the other — the foundation of contrastive corpus analysis.

7. Corpus Linguistics with R
A capstone showcase presenting complete case studies that integrate concordancing, frequency analysis, collocations, and keyness into full corpus-based research workflows.
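As a taste of the concordancing method at the heart of this course, a minimal keyword-in-context extractor fits in a few lines of base R. Packages such as quanteda provide a full-featured kwic() for real work; the toy sentence below is invented for illustration:

```r
# Toy corpus (invented for illustration)
text   <- "the cat sat on the mat and the dog sat on the rug"
tokens <- strsplit(text, " ")[[1]]
node   <- "sat"   # the search word
window <- 2       # words of context on each side

# Find every occurrence of the node and assemble its context window
hits <- which(tokens == node)
kwic <- sapply(hits, function(i) {
  left  <- paste(tokens[max(1, i - window):(i - 1)], collapse = " ")
  right <- paste(tokens[(i + 1):min(length(tokens), i + window)], collapse = " ")
  paste(left, "[", node, "]", right)
})
```

The result is one aligned context line per occurrence, which is exactly what a concordance display sorts and filters.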


Introduction to Text Analysis

Duration: 7 tutorials · Self-paced
Audience: Humanities and social science students; researchers wanting computational approaches to text
Aim: Take learners from a conceptual introduction to text analysis through to applied techniques including topic modelling, sentiment analysis, and network analysis

Text analysis uses computational methods to extract patterns, topics, sentiment, and relational structure from large collections of text. This course introduces the key methods in sequence, building from foundational R skills and string processing to more sophisticated analyses. By the end, learners will be able to apply a range of text-analytic methods to their own research texts.

What you will gain:

  • An understanding of the major families of computational text analysis and their research applications
  • Practical R skills for cleaning, processing, and analysing text data
  • Hands-on experience with topic modelling, sentiment analysis, and network analysis
  • The ability to select the most appropriate method for a given research question

1. Introduction to Text Analysis
Overview of the field and key concepts. Situates text analysis within broader computational humanities and social science research.

2. Getting Started with R
First introduction to R and RStudio. Focus on the first four sections for the purposes of this course.

3. String Processing
Core string manipulation skills for preparing raw text for analysis.

4. Practical Overview of Selected Text Analytics Methods
A hands-on introduction to common text analysis methods — frequency analysis, TF-IDF, basic classification — using the R skills built in the previous tutorials.

5. Topic Modelling
Introduces Latent Dirichlet Allocation (LDA) and related topic models for discovering thematic structure in document collections. Covers both theory and practical implementation in R.

6. Sentiment Analysis
Introduces lexicon-based and machine-learning approaches to extracting opinion and emotion from text. Covers dictionary methods, valence shifting, and visualisation.

7. Network Analysis
Introduces network (graph) analysis as a method for representing relational structure in textual and social data. Covers node and edge construction, centrality measures, and visualisation in R.
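To give a flavour of the lexicon-based approach covered in the sentiment analysis tutorial, here is a deliberately tiny sketch in base R. The four-word lexicon is invented for illustration; real analyses use curated lexicons such as AFINN or Bing:

```r
# Tiny illustrative sentiment lexicon (invented; real work uses curated lexicons)
lexicon <- c(good = 1, great = 2, bad = -1, awful = -2)

# Score a text by summing the valence of every lexicon word it contains
score_sentiment <- function(text) {
  words <- strsplit(tolower(gsub("[[:punct:]]", "", text)), "\\s+")[[1]]
  sum(lexicon[words], na.rm = TRUE)   # words outside the lexicon score NA
}

score_sentiment("A great film with a good script")   # positive score
score_sentiment("An awful plot and bad acting")      # negative score
```

Dictionary methods like this are fast and transparent; the tutorial also discusses their limits (negation, irony) and valence-shifting corrections.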


Data Visualisation for Linguists

Duration: 6 tutorials · Self-paced
Audience: Linguists and language researchers who want to communicate their findings more effectively; anyone who produces graphs and tables in R
Aim: Develop principled, publication-quality data visualisation skills using ggplot2 and related tools, with linguistic data as the running example

Effective visualisation is one of the most transferable skills in quantitative research. This course builds from the principles of good visualisation design through the mechanics of ggplot2, covering the graph types most commonly needed in linguistics and language research: frequency distributions, scatter plots, heat maps, geographic maps, and interactive visualisations. Special attention is given to colour accessibility, annotations, and formatting for publication.

What you will gain:

  • A principled understanding of what makes a graph effective or misleading
  • Practical skills with ggplot2: the grammar of graphics, geoms, scales, facets, themes, and annotations
  • The ability to produce publication-quality static and interactive visualisations from linguistic data
  • Confidence in choosing the right graph type for the right data and research question

1. Getting Started with R
Introduction to R and RStudio with a focus on the data structures and workflow needed for visualisation.

2. Introduction to Data Visualisation
Introduces visualisation philosophy, perceptual principles, and the grammar of graphics. Covers when to use which chart type and common pitfalls.

3. Descriptive Statistics
Covers the summary statistics — means, medians, distributions, variance — that underpin most visualisations of linguistic data.

4. Data Visualisation with R
In-depth ggplot2 tutorial covering the most important geoms for linguistic data: histograms, density plots, box plots, bar charts, scatter plots, and line graphs, with worked examples from corpus and experimental linguistics.

5. Visualising and Analysing Survey Data
Covers visualisation methods specific to Likert-scale and questionnaire data: cumulative density plots, diverging stacked bar charts, and the likert package.

6. Maps and Spatial Visualisation
Introduces geographic visualisation of linguistic data — dialect maps, distribution maps, and choropleth maps — using ggplot2 and sf.
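The frequency plots at the heart of this course can be prototyped even in base R graphics; the tutorials build the same plots with ggplot2, which adds scales, facets, and themes. The word counts below are invented for illustration:

```r
# Illustrative word counts (invented)
word_freq <- c(the = 120, of = 70, and = 65, to = 50, a = 45)

# Draw to an off-screen PNG so the sketch also runs headless
png(tempfile(fileext = ".png"))
barplot(word_freq,
        main = "Top five word frequencies",
        ylab = "Frequency (tokens)")
dev.off()

# Relative frequencies (per 100 tokens), as you would report them
rel_freq <- round(100 * word_freq / sum(word_freq), 1)
```

Computing the summary first and plotting it second is the workflow the ggplot2 tutorials follow as well: the data frame carries the numbers, the plot only presents them.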


Introduction to Statistics in the Humanities and Social Sciences

Duration: 7 tutorials · Self-paced
Audience: Humanities and social science students and researchers with little or no prior knowledge of statistics
Aim: Build statistical literacy and practical quantitative skills from the ground up, using R as the analysis environment throughout

This course provides a conceptual and practical introduction to statistics for researchers whose background is in the humanities or social sciences. It begins with the philosophical foundations of quantitative reasoning and builds systematically through descriptive statistics, visualisation, and inferential testing. No prior statistical knowledge is assumed; by the end, learners will be able to conduct and interpret basic statistical analyses and communicate their results clearly.

What you will gain:

  • A solid conceptual understanding of statistical thinking, probability, and hypothesis testing
  • Practical skills in R for summarising, tabulating, visualising, and testing data
  • The ability to select, apply, and interpret the most common inferential tests (t-tests, chi-square, correlation, simple regression)
  • Confidence in reading and critically evaluating quantitative results in published research

1. Introduction to Quantitative Reasoning
A conceptual introduction to scientific thinking, the logic of hypothesis testing, and the role of quantitative methods in humanities and social science research.

2. Basic Concepts in Quantitative Research
Defines the core concepts of statistics: variables, data types, sampling, populations, reliability, and validity.

3. Getting Started with R
Introduction to R and RStudio. Focus on the first four sections for the purposes of this course.

4. Handling Tables in R
How to work with tabular data in R: importing, cleaning, reshaping, and summarising data frames using dplyr and tidyr.

5. Descriptive Statistics
How to summarise and describe data numerically and visually: means, medians, standard deviations, distributions, and frequency tables.

6. Introduction to Data Visualisation
Principles of data visualisation and hands-on practice creating and customising graphs in R.

7. Basic Inferential Statistics
Introduction to hypothesis testing, p-values, confidence intervals, t-tests, chi-square tests, correlation, and simple linear regression — with practical exercises in R throughout.
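The tests listed above are all one-liners in base R. The sketch below runs each on simulated data; the group means, standard deviations, and sample sizes are invented for illustration:

```r
# Simulated data (invented parameters, fixed seed for reproducibility)
set.seed(42)
groupA <- rnorm(30, mean = 100, sd = 15)   # e.g. scores in condition A
groupB <- rnorm(30, mean = 110, sd = 15)   # e.g. scores in condition B

t_res   <- t.test(groupA, groupB)                           # two-sample t-test
chi_res <- chisq.test(matrix(c(30, 10, 20, 25), nrow = 2))  # 2x2 chi-square
cor_res <- cor.test(groupA, groupB)                         # Pearson correlation
lm_res  <- lm(groupB ~ groupA)                              # simple linear regression
```

Each result object prints a p-value and confidence interval; the tutorial focuses on how to choose among these tests and how to interpret and report their output.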


Introduction to Learner Corpus Research

Duration: 7 tutorials · Self-paced
Audience: Applied linguists; SLA researchers; language teachers and test developers; corpus linguists with an interest in learner language
Aim: Introduce the methods and concepts of Learner Corpus Research (LCR) using R, from corpus construction and basic frequency analysis through to lexical diversity, readability, and error analysis

Learner corpus research uses collections of authentic language produced by second-language (L2) learners to investigate the structure, development, and distinctiveness of interlanguage. This course introduces learners to the major analytical methods used in LCR — concordancing, frequency comparison, collocation, POS tagging, lexical diversity, and error analysis — using the ICLE and LOCNESS corpora as running examples.

What you will gain:

  • An understanding of what learner corpora are, how they differ from native-speaker corpora, and how they are used in SLA research
  • Practical skills for comparing learner and native-speaker language quantitatively using R
  • Experience with lexical diversity measures, readability scores, and spelling error detection
  • The ability to design and conduct a basic learner corpus study and interpret its findings in the context of SLA theory

1. Introduction to Text Analysis
Overview of text analysis and corpus linguistics. Introduces the key concepts — corpus, frequency, concordance, collocation — that underpin learner corpus research.

2. Getting Started with R
Introduction to R and RStudio with a focus on the data structures and workflow needed for corpus analysis.

3. String Processing
Core string handling skills for working with raw learner corpus texts: cleaning, normalising, splitting, and extracting character patterns.

4. Concordancing (Keywords-in-Context)
How to extract and inspect KWIC concordances from learner texts, and how to use concordancing to investigate how learners use specific words or constructions.

5. Collocation and N-gram Analysis
How to identify and compare collocational patterns between learner and native-speaker corpora — a core method for studying collocational competence and L1 transfer effects.

6. Analysing Learner Language with R
A comprehensive tutorial covering the full range of learner corpus analysis methods: frequency comparison, POS tagging, lexical diversity, readability scores, and spelling error detection, with worked examples from ICLE and LOCNESS.

7. Keyness and Keyword Analysis
How to identify words that are systematically over- or under-used by learners relative to native-speaker norms — one of the most informative methods in learner corpus research.
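The over- or under-use comparison behind keyness analysis boils down to a log-likelihood (G2) calculation that fits in a few lines of base R. The counts below are invented for illustration:

```r
# Illustrative counts (invented) for one word in two corpora
a <- 120          # hits in the learner corpus
b <- 40           # hits in the reference corpus
c_size <- 100000  # learner corpus size in tokens
d_size <- 120000  # reference corpus size in tokens

# Expected frequencies under the null hypothesis of equal use
e1 <- c_size * (a + b) / (c_size + d_size)
e2 <- d_size * (a + b) / (c_size + d_size)

# Log-likelihood (G2); values above 3.84 are significant at p < .05 (df = 1)
g2 <- 2 * (a * log(a / e1) + b * log(b / e2))
overused <- a > e1 && g2 > 3.84   # TRUE: learners over-use this word
```

The keyness tutorial wraps this arithmetic in corpus-wide comparisons and adds effect-size measures alongside significance.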


Natural Language Processing with R

Duration: 7 tutorials · Self-paced
Audience: Computational linguists; data scientists working with language data; linguists wanting to apply NLP methods
Assumed background: Intermediate R skills; basic familiarity with statistics (descriptive statistics, simple regression)
Aim: Introduce the core methods of applied NLP in R, from text preprocessing and feature extraction through to classification, named entity recognition, and transformer-based text representations

Natural language processing (NLP) builds on corpus and statistical methods to develop computational pipelines for understanding and generating language at scale. This course introduces learners to the NLP workflow in R, using real linguistic datasets throughout. Topics progress from foundational text preprocessing and feature engineering to supervised classification, topic models, and an introduction to working with large language model (LLM) embeddings and APIs.

What you will gain:

  • A clear understanding of the NLP pipeline: from raw text to structured, analysable representations
  • Practical skills in text preprocessing, tokenisation, stopword removal, stemming, and lemmatisation
  • Experience building document-feature matrices and applying TF-IDF weighting
  • Hands-on practice with text classification, named entity recognition (NER), and dependency parsing
  • An introduction to word embeddings and transformer-based representations

1. Introduction to Text Analysis
Conceptual overview of the text analysis landscape, situating NLP within corpus linguistics and computational linguistics.

2. String Processing
Foundation string manipulation skills — essential for all preprocessing steps in NLP pipelines.

3. Regular Expressions
In-depth introduction to regex as the primary pattern-matching tool in text preprocessing and feature extraction.

4. Practical Overview of Selected Text Analytics Methods
Hands-on introduction to document-feature matrices, TF-IDF, and basic classification workflows in R.

5. Topic Modelling
Probabilistic topic models as an unsupervised NLP method for discovering thematic structure in large text collections.

6. Analysing Learner Language with R
A rich applied NLP example: POS tagging with udpipe, sequence analysis, and lexical diversity measures — all key NLP tasks applied to real corpus data.

7. Network Analysis
How to represent and analyse relational structure in language data using graph methods — applicable to semantic networks, co-occurrence graphs, and social networks of linguistic interaction.
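Several of the tutorials above revolve around the document-feature matrix. Here is a minimal base-R sketch of TF-IDF weighting on a three-document toy corpus (invented for illustration); quanteda's dfm() and dfm_tfidf() do the same at scale:

```r
# Toy corpus (invented for illustration)
docs <- c(d1 = "cats chase mice", d2 = "dogs chase cats", d3 = "mice eat cheese")
tok  <- strsplit(docs, " ")
vocab <- sort(unique(unlist(tok)))

# Term-frequency matrix: rows = documents, columns = vocabulary
tf <- t(sapply(tok, function(w) table(factor(w, levels = vocab))))

# Inverse document frequency, then the TF-IDF-weighted matrix
idf   <- log(length(docs) / colSums(tf > 0))
tfidf <- tf * rep(idf, each = nrow(tf))
```

Words that occur in every document get an IDF of zero and drop out of the weighted matrix, which is exactly why TF-IDF highlights document-distinctive vocabulary.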


Long Courses

Long courses are structured as 12-week, semester-length programmes. They are designed to serve as a complete scaffold for university courses — each week provides a lecture topic, one or more LADAL tutorials, and recommended readings from key texts in the field. Independent learners are welcome to follow along using the tutorials and reading lists as a structured self-study guide.

Each course is organised as a series of weekly sessions with:

  • Lecture topics outlining the main concepts and learning objectives for the week
  • LADAL tutorials providing step-by-step, hands-on exercises with reproducible R code
  • Recommended readings to reinforce and expand the theoretical and methodological foundations

Introduction to Digital Humanities with R

Duration: 12 weeks (1h lecture + 1.5h tutorial per week)
Audience: Students and researchers in the humanities — literature, history, cultural studies, linguistics, media studies — who want to develop computational skills for digital research
Assumed background: None — no programming or statistics experience required
Aim: Introduce digital humanities methods and tools using R, moving from foundational data literacy and text processing through to corpus analysis, visualisation, network analysis, and a final project

Digital humanities applies computational methods to humanistic inquiry: analysing large literary corpora, mapping cultural data geographically, tracing discourse patterns across historical archives, or modelling networks of social interaction. This 12-week course introduces students to the core DH toolkit through the lens of R, with weekly hands-on tutorials grounded in real humanities datasets. No prior programming experience is assumed; by the end of the course, students will be able to design, conduct, and communicate a reproducible computational analysis of a humanities dataset.

Week 1: What Is Digital Humanities?
  • Lecture: Overview of digital humanities: history, debates, and current landscape; relationship to corpus linguistics, text analysis, and data science; what counts as DH research
  • LADAL content: Introduction to Text Analysis
  • Additional Readings:
    Burdick, A., et al. (2012). Digital humanities. MIT Press, Ch. 1
    Drucker, J. (2021). The digital humanities coursebook. Routledge, Ch. 1
Week 2: Reproducible Research and Data Management
  • Lecture: Why reproducibility matters in DH; introduction to R and RStudio; file organisation, project workflows, and version control basics
  • LADAL content: Reproducible Research and Creating R Notebooks
  • Additional Readings:
    Flanagan, J. (2025). Reproducibility, replicability, robustness, and generalizability in corpus linguistics. International Journal of Corpus Linguistics. https://doi.org/10.1075/ijcl.24113.fla
Week 3: Getting Started with R
  • Lecture: Introduction to R syntax, data types, vectors, and data frames; the tidyverse ecosystem; reading and writing data
  • LADAL content: Getting Started with R and Loading and Saving Data
  • Additional Readings:
    Wickham, H., & Grolemund, G. (2016). R for data science. Ch. 1–3. https://r4ds.had.co.nz
Week 4: Working with Text Data
  • Lecture: How text is represented computationally; encoding, tokenisation, and the document-feature matrix; from raw text to structured data
  • LADAL content: String Processing and Regular Expressions
  • Additional Readings:
    Jockers, M. L. (2014). Text analysis with R for students of literature. Springer, Ch. 1–3
Week 5: Building and Exploring Digital Corpora
  • Lecture: What is a corpus? Sampling principles, metadata, corpus design for humanities research; downloading and preparing digital texts
  • LADAL content: Downloading Texts from Project Gutenberg and Converting PDFs to Text
  • Additional Readings:
    Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics. Cambridge University Press, Ch. 1–2
Week 6: Frequency Analysis and Visualisation
  • Lecture: Zipf’s law and frequency distributions; word counts, type-token ratios, and dispersion; principles of effective visualisation for humanities data
  • LADAL content: Introduction to Data Visualisation and Descriptive Statistics
  • Additional Readings:
    Jockers (2014), Ch. 4–5
Week 7: Concordancing, Collocations, and Keywords
  • Lecture: Searching corpora; KWIC concordances and their interpretation; collocation and association measures; keyness and corpus comparison
  • LADAL content: Concordancing with R and Keyness Analysis
  • Additional Readings:
    Baker, P. (2006). Using corpora in discourse analysis. Continuum, Ch. 3–4
Week 8: Topic Modelling and Thematic Analysis
  • Lecture: Latent Dirichlet Allocation (LDA); interpreting topics; applications in literary and historical research; limitations and critical perspectives
  • LADAL content: Topic Modelling
  • Additional Readings:
    Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84
    Maier, D., et al. (2021). Applying LDA topic modeling in communication research. In Computational methods for communication science (pp. 13–38). Routledge
Week 9: Sentiment Analysis and Opinion Mining
  • Lecture: Lexicon-based and machine learning approaches to sentiment; subjectivity, valence, and emotion; applications in literary and media studies
  • LADAL content: Sentiment Analysis
  • Additional Readings:
    Liu, B. (2012). Sentiment analysis and opinion mining. Ch. 1–2
Week 10: Network Analysis for Humanities Research
  • Lecture: Graphs and networks as representations of humanistic data; character networks in fiction; citation networks; social networks in historical sources; centrality and community detection
  • LADAL content: Network Analysis
  • Additional Readings:
    Moretti, F. (2011). Network theory, plot analysis. New Left Review, 68, 80–102
Week 11: Maps, Space, and Geographic Visualisation
  • Lecture: Spatial thinking in digital humanities; mapping literary geography, dialect distribution, and historical events; choropleth maps and point maps in R
  • LADAL content: Maps and Spatial Visualisation
  • Additional Readings:
    Drucker (2021), Ch. 8
Week 12: Project Workshop and Critical Reflections
  • Lecture: Critical DH — bias in corpora and algorithms, data ethics, representation, and positionality; communicating DH research; the future of digital humanities
  • Tutorial: Student project presentations and peer feedback
Core Readings
  • Baker, P. (2006). Using corpora in discourse analysis. Continuum.
  • Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge University Press.
  • Burdick, A., Drucker, J., Lunenfeld, P., Presner, T., & Schnapp, J. (2012). Digital humanities. MIT Press.
  • Drucker, J. (2021). The digital humanities coursebook. Routledge.
  • Flanagan, J. (2025). Reproducibility, replicability, robustness, and generalizability in corpus linguistics. International Journal of Corpus Linguistics. https://doi.org/10.1075/ijcl.24113.fla
  • Jockers, M. L. (2014). Text analysis with R for students of literature. Springer.
  • Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1–167.
  • Maier, D., et al. (2021). Applying LDA topic modeling in communication research. In Computational methods for communication science (pp. 13–38). Routledge.
  • Wickham, H., & Grolemund, G. (2016). R for data science. O’Reilly. https://r4ds.had.co.nz

Introduction to Corpus Linguistics and Text Analysis with R

Duration: 12 weeks (1h lecture + 1.5h tutorial per week)
Audience: Students in linguistics, applied linguistics, translation, communication, and literary studies
Assumed background: None
Aim: Introduce corpus-based methods for linguistic analysis and hands-on text analysis with R, from corpus construction through to sentiment analysis, topic modelling, and network analysis

Week 1: Introduction to Corpus Linguistics and Text Analytics
  • Lecture: What is corpus linguistics? Key concepts, history, and applications; corpus vs. introspective and experimental methods; overview of the course
  • LADAL content: Introduction to Text Analysis
  • Additional Readings:
    McEnery, T., & Hardie, A. (2012). Corpus linguistics: Method, theory and practice. Cambridge University Press, Ch. 1–2
Week 2: Working with Digital Data and Reproducibility
  • Lecture: Principles of reproducible research; introduction to R Notebooks; file management and workflow
  • LADAL content: Reproducible Research and Creating R Notebooks
  • Additional Readings:
    Flanagan, J. (2025). Reproducibility, replicability, robustness, and generalizability in corpus linguistics. International Journal of Corpus Linguistics. https://doi.org/10.1075/ijcl.24113.fla
Week 3: Getting Started with R
  • Lecture: Introduction to R syntax, data types, vectors, and data frames; the tidyverse ecosystem; reading and writing data
  • LADAL content: Getting Started with R and Loading and Saving Data
  • Additional Readings:
    Wickham, H., & Grolemund, G. (2016). R for data science. Ch. 1–3. https://r4ds.had.co.nz
Week 4: Corpus Compilation and Preparation
  • Lecture: Types of corpora; sampling principles and representativeness; metadata and annotation; legal and ethical issues in corpus construction
  • LADAL content: Downloading Texts from Project Gutenberg
  • Additional Readings:
    Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics. Cambridge University Press, Ch. 1–2
Week 5: Frequency and Dispersion
  • Lecture: Counting words and n-grams; Zipf’s law; normalised frequencies; dispersion measures and why they matter; type-token ratio
  • LADAL content: Handling Tables in R
  • Additional Readings:
    McEnery & Hardie (2012), Ch. 3
    Gries, S. T. (2024). Frequency, dispersion, association, and keyness. Ch. 1–2
Week 6: Concordancing and KWIC
  • Lecture: Searching corpora; concordance displays and their interpretation; sorting and filtering; from examples to patterns
  • LADAL content: Concordancing with R
  • Additional Readings:
    Baker, P. (2006). Using corpora in discourse analysis. Ch. 3
Week 7: Collocations and N-grams
  • Lecture: Association measures (MI, t-score, log-likelihood, Dice); phraseology and formulaic sequences; n-gram extraction and analysis
  • LADAL content: Collocation and N-gram Analysis
  • Additional Readings:
    Gries (2024), Ch. 2
Week 8: Keywords and Keyness
  • Lecture: Reference corpora and keyness; log-likelihood and log ratio as keyness measures; interpretation and applications in discourse analysis
  • LADAL content: Keyness and Keyword Analysis
  • Additional Readings:
    Gries (2024), Ch. 3
Week 9: Advanced Text Analytics I — Topic Modelling
  • Lecture: Unsupervised text classification; LDA and its assumptions; interpreting and validating topic models; applications in linguistics and discourse analysis
  • LADAL content: Topic Modelling
  • Additional Readings:
    Maier, D., et al. (2021). Applying LDA topic modeling in communication research (pp. 13–38)
Week 10: Advanced Text Analytics II — Sentiment and Network Analysis
  • Lecture: Sentiment lexicons; opinion mining; co-occurrence networks and semantic networks from corpus data
  • LADAL content: Sentiment Analysis and Network Analysis
  • Additional Readings:
    Liu (2012), Ch. 1–2
Week 11: Case Studies in Corpus Linguistics
  • Lecture: Corpus-based studies of grammar, lexis, and discourse; from method to interpretation; writing up corpus research
  • LADAL content: Corpus Linguistics with R
  • Additional Readings:
    Baker (2006), Ch. 7
Week 12: Project Workshop and Presentations
  • Lecture: Ethics in corpus research; future directions; communicating corpus findings to non-specialist audiences
  • Tutorial: Student project work
Core Readings
  • Baker, P. (2006). Using corpora in discourse analysis. Continuum.
  • Biber, D., Conrad, S., & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge University Press.
  • Flanagan, J. (2025). Reproducibility in corpus linguistics. International Journal of Corpus Linguistics. https://doi.org/10.1075/ijcl.24113.fla
  • Gries, S. T. (2024). Frequency, dispersion, association, and keyness (Studies in Corpus Linguistics, Vol. 115). John Benjamins.
  • Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1–167.
  • Maier, D., et al. (2021). Applying LDA topic modeling in communication research. In Computational methods for communication science (pp. 13–38). Routledge.
  • McEnery, T., & Hardie, A. (2012). Corpus linguistics: Method, theory and practice. Cambridge University Press.
  • Wickham, H., & Grolemund, G. (2016). R for data science. O’Reilly. https://r4ds.had.co.nz

Introduction to Statistics in the Humanities and Social Sciences

Duration: 12 weeks (1h lecture + 1.5h tutorial per week)
Audience: Students and researchers in linguistics, psychology, education, sociology, and related fields
Assumed background: None
Aim: Provide a practical and conceptual foundation in quantitative methods, from probability and descriptive statistics through regression and mixed-effects modelling, using R throughout

Week 1: Introduction to Quantitative Research
  • Lecture: The role of quantitative methods in humanities and social sciences; an overview of statistical thinking; the research cycle; types of research questions
  • LADAL content: Introduction to Quantitative Reasoning
  • Additional Readings:
    Field, Miles & Field (2012), Ch. 1
    Baayen (2008), Ch. 1
Week 2: Basic Concepts in Quantitative Research
  • Lecture: Data types and measurement scales; variables, operationalisation, and construct validity; sampling and representativeness; reliability and validity
  • LADAL content: Basic Concepts in Quantitative Research
  • Additional Readings:
    Gries (2013), Ch. 1–2
Week 3: Getting Started with R — Part 1
  • Lecture: Introduction to R and RStudio; installing and loading packages; basic syntax and data structures; the tidyverse ecosystem
  • LADAL content: Getting Started with R
  • Additional Readings:
    Wickham & Grolemund (2016), Ch. 1–3
Week 4: Getting Started with R — Part 2: Loading and Handling Data
  • Lecture: Importing datasets from CSV, Excel, and text files; data cleaning and transformation; working with factors and missing values
  • LADAL content: Loading and Saving Data and Handling Tables in R
  • Additional Readings:
    Baayen (2008), Ch. 2
Week 5: R Basics for Statistical Analysis
  • Lecture: Vectors, factors, data frames, indexing, and subsetting; writing functions; applying operations across groups with dplyr
  • LADAL content: Getting Started with R — advanced sections
  • Additional Readings:
    Field, Miles & Field (2012), Ch. 2–3
Week 6: Descriptive Statistics
  • Lecture: Measures of central tendency and dispersion; frequency distributions; skewness and kurtosis; the normal distribution; summarising grouped data
  • LADAL content: Descriptive Statistics
  • Additional Readings:
    Baayen (2008), Ch. 3
    Winter (2019), Ch. 2
Week 7: Visualising Data
  • Lecture: Principles of effective visualisation; choosing the right graph type; histograms, box plots, scatter plots, and bar charts; ggplot2 grammar of graphics
  • LADAL content: Data Visualisation with R
  • Additional Readings:
    Wickham & Grolemund (2016), Ch. 14
Week 8: Hypothesis Testing and Power Analysis
  • Lecture: The logic of null hypothesis significance testing; t-tests, ANOVA; p-values and their interpretation; effect sizes; statistical power and sample size planning
  • LADAL content: Basic Inferential Statistics
  • Additional Readings:
    Field, Miles & Field (2012), Ch. 4
    Gries (2013), Ch. 3
Week 9: Correlation and Simple Regression
  • Lecture: Pearson and Spearman correlation; simple linear regression; interpreting intercepts and slopes; assumptions and diagnostics
  • LADAL content: Regression Analysis
  • Additional Readings:
    Baayen (2008), Ch. 4
Week 10: Multiple Regression and Model Diagnostics
  • Lecture: Multiple regression; multicollinearity; residual analysis; model comparison with AIC/BIC; stepwise and theory-driven model building
  • LADAL content: Regression Analysis — advanced sections
  • Additional Readings:
    Winter (2019), Ch. 5
Week 11: Logistic Regression
  • Lecture: Binary and ordinal outcomes; logistic regression model fitting and interpretation; odds ratios and predicted probabilities; the proportional odds model
  • LADAL content: Regression Analysis and Visualising and Analysing Survey Data
  • Additional Readings:
    Baayen (2008), Ch. 5
    Winter (2019), Ch. 6
Week 12: Mixed-Effects Models
  • Lecture: Why mixed effects? Random intercepts and random slopes; by-participant and by-item random effects; fitting and interpreting mixed models with lme4
  • LADAL content: Mixed-Effects Models
  • Tutorial: Student mini-projects applying learned methods to real datasets
  • Additional Readings:
    Gries (2013), Ch. 6
    Field, Miles & Field (2012), Ch. 12
Core Readings
  • Baayen, R. H. (2008). Analyzing linguistic data. Cambridge University Press.
  • Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage.
  • Gries, S. T. (2013). Statistics for linguistics with R: A practical introduction (2nd ed.). De Gruyter Mouton.
  • Wickham, H., & Grolemund, G. (2016). R for data science. O’Reilly. https://r4ds.had.co.nz
  • Winter, B. (2019). Statistics for linguists: An introduction using R. Routledge.

Advanced Statistics in the Humanities and Social Sciences

Duration: 12 weeks (1h lecture + 1.5h tutorial per week)
Audience: Students and researchers with prior knowledge of basic statistics who wish to apply advanced quantitative methods
Assumed background: Basic statistics (t-tests, regression) and intermediate R skills
Aim: Develop advanced skills in multivariate modelling, classification, clustering, and survey data analysis using R

Week 1: Advanced Data Management and Reproducible Workflows
  • Lecture: Organising complex datasets; reproducibility in advanced research; scripting and automating analysis pipelines; version control with Git
  • LADAL content: Reproducible Research and Creating R Notebooks
  • Additional Readings: Flanagan (2025)
Week 2: Review of Descriptive and Inferential Statistics
  • Lecture: Quick review of key concepts: distributions, t-tests, correlations, confidence intervals, effect sizes, and power
  • LADAL content: Descriptive Statistics and Basic Inferential Statistics
  • Additional Readings: Field, Miles & Field (2012), Ch. 1–4
Week 3: Advanced Regression — Multiple and Hierarchical Models
  • Lecture: Multiple regression; interaction terms; hierarchical (nested) models; mixed-effects models with random intercepts and slopes
  • LADAL content: Regression Analysis and Mixed-Effects Models
  • Additional Readings: Baayen (2008), Ch. 4–5; Winter (2019), Ch. 5–6
Week 4: Logistic Regression and Generalised Linear Models
  • Lecture: Binary and multinomial outcomes; model fitting and interpretation; goodness-of-fit; GLMs as a unified framework
  • LADAL content: Regression Analysis
  • Additional Readings: Winter (2019), Ch. 6
Week 5: Classification — Decision Trees
  • Lecture: Decision trees; recursive partitioning; overfitting and pruning; interpreting tree outputs; applications in linguistic classification problems
  • LADAL content: Tree-Based Models
  • Additional Readings: Gries (2013), Ch. 6
Week 6: Classification — Random Forests and Ensemble Methods
  • Lecture: Ensemble learning; bagging and boosting; random forests; variable importance; improving prediction accuracy and generalisability
  • LADAL content: Tree-Based Models
  • Additional Readings: James, Witten, Hastie & Tibshirani (2021), Ch. 8
Week 7: Clustering and Correspondence Analysis
  • Lecture: Unsupervised classification; k-means and hierarchical clustering; choosing the number of clusters; correspondence analysis for categorical data
  • LADAL content: Cluster and Correspondence Analysis
  • Additional Readings: Gries (2013), Ch. 7
Week 8: Survey and Questionnaire Data Analysis I
  • Lecture: Preparing survey data; dealing with missing values; Likert scales and their properties; descriptive analysis and visualisation of survey items
  • LADAL content: Visualising and Analysing Survey Data
  • Additional Readings: Field, Miles & Field (2012), Ch. 10; Baayen (2008), Ch. 6
Week 9: Survey and Questionnaire Data Analysis II
  • Lecture: Reliability (Cronbach’s α, McDonald’s ω); factor analysis and scale validation; cross-tabulations and chi-square tests; ordinal regression for Likert outcomes
  • LADAL content: Visualising and Analysing Survey Data
  • Additional Readings: Field, Miles & Field (2012), Ch. 11
Week 10: Dimension Reduction and Multivariate Techniques
  • Lecture: Principal Component Analysis (PCA); multidimensional scaling (MDS); detecting latent variables; applications to linguistic and social science data
  • LADAL content: Dimension Reduction Methods
  • Additional Readings: Gries (2013), Ch. 8
Week 11: Model Evaluation, Diagnostics, and Advanced Visualisation
  • Lecture: Residual analysis and outlier detection; model comparison and selection criteria (AIC, BIC, cross-validation); visualisation techniques for multivariate data
  • LADAL content: Data Visualisation with R and Regression Analysis
  • Additional Readings: Winter (2019), Ch. 7
Week 12: Applications and Student Mini-Projects
  • Lecture: Integrating advanced methods into humanities and social science research; ethical considerations; communicating complex statistical results; reproducibility revisited
  • Tutorial: Student project work applying classification, clustering, and survey analysis to real datasets
  • Additional Readings: Baayen (2008), Ch. 7; Field, Miles & Field (2012), Ch. 12
Core Readings
  • Baayen, R. H. (2008). Analyzing linguistic data. Cambridge University Press.
  • Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. Sage.
  • Flanagan, J. (2025). Reproducibility in corpus linguistics. International Journal of Corpus Linguistics. https://doi.org/10.1075/ijcl.24113.fla
  • Gries, S. T. (2013). Statistics for linguistics with R: A practical introduction (2nd ed.). De Gruyter Mouton.
  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning: With applications in R (2nd ed.). Springer.
  • Winter, B. (2019). Statistics for linguists: An introduction using R. Routledge.
